AITopics | global optimality

global optimality

59e02a1440e6667e01628ed4c325255c-Paper-Conference.pdf

Neural Information Processing Systems | Feb-19-2026, 03:53:51 GMT

objective, optimization, probability, (16 more...)

Neural Information Processing Systems

Country:

North America > Cuba > Holguín Province > Holguín (0.04)
Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Global Optimality in Bivariate Gradient-based DAG Learning

Neural Information Processing Systems | Dec-24-2025, 16:50:38 GMT

Recently, a new class of non-convex optimization problems motivated by the statistical problem of learning an acyclic directed graphical model from data has attracted significant interest. While existing work uses standard first-order optimization schemes to solve this problem, proving the global optimality of such approaches has proven elusive. The difficulty lies in the fact that unlike other non-convex problems in the literature, this problem is not benign, and possesses multiple spurious solutions that standard approaches can easily get trapped in. In this paper, we prove that a simple path-following optimization scheme globally converges to the global minimum of the population loss in the bivariate setting.

bivariate gradient-based dag learning, global optimality, name change, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias

Neural Information Processing Systems | Dec-24-2025, 06:27:32 GMT

The generalization mystery of overparametrized deep nets has motivated efforts to understand how gradient descent (GD) converges to low-loss solutions that generalize well. Real-life neural networks are initialized from small random values and trained with cross-entropy loss for classification (unlike the lazy or NTK regime of training where analysis was more successful), and a recent sequence of results (Lyu and Li, 2020; Chizat and Bach, 2020; Ji and Telgarsky, 2020) provide theoretical evidence that GD may converge to the max-margin solution with zero loss, which presumably generalizes well. However, the global optimality of margin is proved only in some settings where neural nets are infinitely or exponentially wide. The current paper is able to establish this global optimality for two-layer Leaky ReLU nets trained with gradient flow on linearly separable and symmetric data, regardless of the width. The analysis also gives some theoretical justification for recent empirical findings (Kalimeris et al., 2019) on the so-called simplicity bias of GD towards linear or other simple classes of solutions, especially early in training. On the pessimistic side, the paper suggests that such results are fragile. A simple data manipulation can make gradient flow converge to a linear classifier with suboptimal margin.

gradient descent, margin maximization and simplicity bias, two-layer net, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Global Optimality of Local Search for Low Rank Matrix Recovery

Neural Information Processing Systems | Nov-21-2025, 15:16:09 GMT

We show that there are no spurious local minima in the non-convex factorized parametrization of low-rank matrix recovery from incoherent linear measurements. With noisy measurements we show all local minima are very close to a global optimum. Together with a curvature bound at saddle points, this yields a polynomial time global convergence guarantee for stochastic gradient descent from random initialization.
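The factorized parametrization mentioned in this abstract can be illustrated with a small sketch. It is not the authors' code: it assumes a symmetric positive semidefinite target X = U Uᵀ, dense Gaussian measurement matrices, noiseless measurements, and plain full-batch gradient descent in place of the stochastic variant the abstract analyzes. The function name `factorized_gd`, the toy dimensions, and the step size are all hypothetical choices made for illustration.

```python
import numpy as np

def factorized_gd(A, y, n, r, steps=3000, lr=0.02, seed=0):
    """Minimize f(U) = 1/(2m) * sum_k (<A_k, U U^T> - y_k)^2 by gradient descent."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n, r))  # small random initialization
    m = len(y)
    for _ in range(steps):
        # Residual of each linear measurement at the current X = U U^T.
        resid = np.einsum("kij,ij->k", A, U @ U.T) - y
        # Gradient of the loss with respect to X.
        G = np.einsum("k,kij->ij", resid, A) / m
        # Chain rule through X = U U^T gives (G + G^T) U.
        U -= lr * (G + G.T) @ U
    return U

# Hypothetical toy instance: n = 6, rank 2, m = 400 Gaussian measurements, no noise.
rng = np.random.default_rng(1)
U_true = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [0.5, 0], [0, 0.5]], dtype=float)
X_true = U_true @ U_true.T
A = rng.standard_normal((400, 6, 6))
y = np.einsum("kij,ij->k", A, X_true)

U_hat = factorized_gd(A, y, n=6, r=2)
rel_err = np.linalg.norm(U_hat @ U_hat.T - X_true) / np.linalg.norm(X_true)
print(f"relative recovery error: {rel_err:.2e}")
```

The factor U is only identifiable up to an r-by-r rotation, so the sketch compares U Uᵀ with the target rather than U with `U_true`.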

global optimality, local search, name change, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.65)

A novel Information-Driven Strategy for Optimal Regression Assessment

Castro, Benjamín, Ramírez, Camilo, Espinosa, Sebastián, Silva, Jorge F., Orchard, Marcos E., Rozas, Heraldo

arXiv.org Machine Learning | Oct-22-2025

In Machine Learning (ML), a regression algorithm aims to minimize a loss function based on data. An assessment method in this context seeks to quantify the discrepancy between the optimal response for an input-output system and the estimate produced by a learned predictive model (the student). Evaluating the quality of a learned regressor remains challenging without access to the true data-generating mechanism, as no data-driven assessment method can ensure the achievability of global optimality. This work introduces the Information Teacher, a novel data-driven framework for evaluating regression algorithms with formal performance guarantees to assess global optimality. Our novel approach builds on estimating the Shannon mutual information (MI) between the input variables and the residuals and applies to a broad class of additive noise models. Through numerical experiments, we confirm that the Information Teacher is capable of detecting global optimality, which is aligned with the condition of zero estimation error with respect to the -- inaccessible, in practice -- true model, working as a surrogate measure of the ground truth assessment loss and offering a principled alternative to conventional empirical performance metrics.
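The core idea of this abstract, that residuals of an optimal regressor under additive noise carry no information about the inputs, can be sketched as follows. This is not the Information Teacher itself: it assumes a simple plug-in histogram estimator of mutual information, a one-dimensional input, and a synthetic sine-plus-noise data model, all of which are illustrative choices. The helper `mutual_information_hist` is a hypothetical name.

```python
import numpy as np

def mutual_information_hist(x, r, bins=8):
    """Plug-in (histogram) estimate of I(x; r) in nats."""
    joint, _, _ = np.histogram2d(x, r, bins=bins)
    p = joint / joint.sum()                 # joint cell probabilities
    px = p.sum(axis=1, keepdims=True)       # marginal over x bins, shape (bins, 1)
    pr = p.sum(axis=0, keepdims=True)       # marginal over r bins, shape (1, bins)
    nz = p > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ pr)[nz])))

# Assumed additive noise model: y = sin(3x) + noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=5000)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(5000)

# Residuals of the true regression function: independent of x by construction.
resid_good = y - np.sin(3.0 * x)

# Residuals of a misspecified linear fit: still depend on x.
slope, intercept = np.polyfit(x, y, deg=1)
resid_bad = y - (slope * x + intercept)

mi_good = mutual_information_hist(x, resid_good)
mi_bad = mutual_information_hist(x, resid_bad)
print(f"MI(x; residual), true model:  {mi_good:.3f} nats")
print(f"MI(x; residual), linear fit:  {mi_bad:.3f} nats")
```

The histogram estimator has a small positive bias even under independence, so a near-zero value rather than an exact zero is what signals an optimal fit in this sketch.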

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2510.14222

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Scalable Distributional Robustness in a Class of Non Convex Optimization with Guarantees

Neural Information Processing Systems | Aug-15-2025, 01:03:32 GMT

Mostly, if not always, these works have focussed on the task of classification or regression.

artificial intelligence, machine learning, optimization, (18 more...)

Neural Information Processing Systems

Country:

North America > Cuba > Holguín Province > Holguín (0.04)
Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Action Dependency Graphs for Globally Optimal Coordinated Reinforcement Learning

Ding, Jianglin, Tang, Jingcheng, Jing, Gangshan

arXiv.org Artificial Intelligence | Jun-3-2025

Action-dependent individual policies, which incorporate both environmental states and the actions of other agents in decision-making, have emerged as a promising paradigm for achieving global optimality in multi-agent reinforcement learning (MARL). However, the existing literature often adopts auto-regressive action-dependent policies, where each agent's policy depends on the actions of all preceding agents. This formulation incurs substantial computational complexity as the number of agents increases, thereby limiting scalability. In this work, we consider a more generalized class of action-dependent policies, which do not necessarily follow the auto-regressive form. We propose to use the "action dependency graph (ADG)" to model the inter-agent action dependencies. Within the context of MARL problems structured by coordination graphs, we prove that an action-dependent policy with a sparse ADG can achieve global optimality, provided the ADG satisfies specific conditions specified by the coordination graph. Building on this theoretical foundation, we develop a tabular policy iteration algorithm with guaranteed global optimality. Furthermore, we integrate our framework into several SOTA algorithms and conduct experiments in complex environments. The empirical results affirm the robustness and applicability of our approach in more general scenarios, underscoring its potential for broader MARL challenges.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2506.00797

Country: Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Global Optimality in Bivariate Gradient-based DAG Learning

Neural Information Processing Systems | May-26-2025, 19:57:53 GMT

Recently, a new class of non-convex optimization problems motivated by the statistical problem of learning an acyclic directed graphical model from data has attracted significant interest. While existing work uses standard first-order optimization schemes to solve this problem, proving the global optimality of such approaches has proven elusive. The difficulty lies in the fact that unlike other non-convex problems in the literature, this problem is not "benign", and possesses multiple spurious solutions that standard approaches can easily get trapped in. In this paper, we prove that a simple path-following optimization scheme globally converges to the global minimum of the population loss in the bivariate setting.

artificial intelligence, bivariate gradient-based dag learning, machine learning, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Building Rome with Convex Optimization

Han, Haoyu, Yang, Heng

arXiv.org Artificial Intelligence | Feb-10-2025

Global bundle adjustment is made easy by depth prediction and convex optimization. We (i) propose a scaled bundle adjustment (SBA) formulation that lifts 2D keypoint measurements to 3D with learned depth, (ii) design an empirically tight convex semidefinite program (SDP) relaxation that solves SBA to certifiable global optimality, (iii) solve the SDP relaxations at extreme scale with Burer-Monteiro factorization and a CUDA-based trust-region Riemannian optimizer (dubbed XM), (iv) build a structure from motion (SfM) pipeline with XM as the optimization engine and show that XM-SfM dominates or compares favorably with existing SfM pipelines in terms of reconstruction quality while being faster, more scalable, and initialization-free.

artificial intelligence, dataset, optimization problem, (14 more...)

arXiv.org Artificial Intelligence

2502.0464

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Greece (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Reviews: Global Optimality of Local Search for Low Rank Matrix Recovery

Neural Information Processing Systems | Jan-20-2025, 18:05:11 GMT

This is a nice result. I am going to list a few nits I had about the paper as I read along. I think addressing some of these points would improve the presentation of the paper. 1. There are a few cases which are not covered by the results, for instance strict saddles in the noisy case, and local minima being close to global in the high-rank, noisy case. A discussion about why these cases are not covered would be nice; I am assuming that it is not just a straightforward modification of the current proof? 2. In practice, I believe that gradient descent from random initialization, without added noise, is sufficient.

global optimality, local search, low rank matrix recovery, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)
Information Technology > Artificial Intelligence > Machine Learning (0.39)
